Enhancing HMM-based biomedical named entity recognition by studying special phenomena
نویسندگان
چکیده
The purpose of this research is to enhance an HMM-based named entity recognizer in the biomedical domain. First, we analyze the characteristics of biomedical named entities. Then, we propose a rich set of features, including orthographic, morphological, part-of-speech, and semantic trigger features. All these features are integrated via a Hidden Markov Model with back-off modeling. Furthermore, we propose a method for biomedical abbreviation recognition and two methods for cascaded named entity recognition. Evaluation on the GENIA V3.02 and V1.1 shows that our system achieves 66.5 and 62.5 F-measure, respectively, and outperforms the previous best published system by 8.1 F-measure on the same experimental setting. The major contribution of this paper lies in its rich feature set specially designed for biomedical domain and the effective methods for abbreviation and cascaded named entity recognition. To our best knowledge, our system is the first one that copes with the cascaded phenomena.
منابع مشابه
Biomedical Named Entity Recognition: A Poor Knowledge HMM-Based Approach
With a recent quick development of a molecular biology domain it becomes indispensable to promote different resources as databases and ontologies that represent the formal knowledge of the domain. As these resources have to be permanently updated, due to a constant appearance of new data, the Information Extraction (IE) methods become very useful. Named Entity Recognition (NER), that is conside...
متن کاملExploring Deep Knowledge Resources in Biomedical Name Recognition
In this paper, we present a named entity recognition system in the biomedical domain. In order to deal with the special phenomena in the biomedical domain, various evidential features are proposed and integrated through a Hidden Markov Model (HMM). In addition, a Support Vector Machine (SVM) plus sigmoid is proposed to resolve the data sparseness problem in our system. Besides the widely used l...
متن کاملNamed Entity Recognition in Biomedical Texts using an HMM Model
Although there exists a huge number of biomedical texts online, there is a lack of tools good enough to help people get information or knowledge from them. Named Entity Recognition (NER) becomes very important for further processing like information retrieval, information extraction and knowledge discovery. We introduce a Hidden Markov Model (HMM) for NER, with a word similarity-based smoothing...
متن کاملRecognizing Names in Biomedical Texts using Hidden Markov Model and SVM plus Sigmoid
In this paper, we present a named entity recognition system in the biomedical domain, called PowerBioNE. In order to deal with the special phenomena in the biomedical domain, various evidential features are proposed and integrated through a Hidden Markov Model (HMM). In addition, a Support Vector Machine (SVM) plus sigmoid is proposed to resolve the data sparseness problem in our system. Finall...
متن کاملConditional Random Fields vs. Hidden Markov Models in a biomedical Named Entity Recognition task
With a recent quick development of a molecular biology domain the Information Extraction (IE) methods become very useful. Named Entity Recognition (NER), that is considered to be the easiest task of IE, still remains very challenging in molecular biology domain because of the complex structure of biomedical entities and the lack of naming convention. In this paper we apply two popular sequence ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Journal of biomedical informatics
دوره 37 6 شماره
صفحات -
تاریخ انتشار 2004